


### Overview of Replication Archive

This repository contains the code and data needed to replicate results in Gonzalez-Rostani (2024) "Elections, Right-wing Populism, and Political-Economic Polarization: The Role of Institutions and Political Outsiders." The structure of the repository is described below. All of the replication instructions assume that you set the working directory to the same folder as the `README.md` or `README.html` file. 
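
As a quick sanity check before running anything, the working-directory assumption can be verified programmatically. The following is a minimal Python sketch and is not part of the archive: the helper name `is_archive_root` is hypothetical, and it assumes that the `do` and `data` folders described below sit next to the README file.

```python
from pathlib import Path


def is_archive_root(folder="."):
    """Return True if `folder` looks like the root of the replication archive.

    Hypothetical helper: it assumes the root holds README.md or README.html
    together with the `do` and `data` directories.
    """
    root = Path(folder)
    has_readme = any((root / name).is_file() for name in ("README.md", "README.html"))
    has_dirs = all((root / name).is_dir() for name in ("do", "data"))
    return has_readme and has_dirs


if is_archive_root():
    print("Working directory looks correct.")
else:
    print("Set the working directory to the folder that contains README.md.")
```

The check only inspects the folder layout; it does not verify the software versions listed below.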

The analysis was conducted on a personal computer with the following configuration.

```
Windows 11 
R version: 4.3.1 
Stata version: 17
Python version: 3.8.17

```

The files in the replication archive are described below.

- `Master_ht.rmd`: This file merges all the code used to create the tables and figures in the main text and appendix, and it generates an HTML file. The rendered output with the main analysis results is provided as `Master_ht.html`. Similarly, `Master_pdf.rmd` contains the same code but generates the PDF version, `Master_pdf.pdf`. Note that the rendered files (PDF and HTML) echo only the analysis code; all of the code runs when the file is rendered and can be inspected by opening the `.rmd` file.

- `Master.do`: This file runs all of the do-files, which create most of the figures and tables (with the exception of the NMF analysis, which is done in Python). A log is provided in the `log` folder.

- `do`: This directory contains all the code used in the analysis. The files are categorized based on their relevance to different sections of the study:
    - File naming convention:
        - Files starting with `1_` correspond to analyses related to the **Vote-switching across Institutions** section.
        - Files starting with `2_` pertain to the **Messaging and Targeting Strategies: Candidate Rhetoric and Party Platforms** section, specifically focusing on **Majoritarian Systems** (e.g., US analyses).
        - Files starting with `3_` also relate to the **Messaging and Targeting Strategies** section but focus on **PRITM Systems**.
        - Files starting with `4_` contain code for **Figures that provide context and additional details** but do not directly reflect results.
  - `4_1_Figures_ISSP.do`: This do-file produces the motivation figures and the appendix figure listed below.
    - Figure 1: Relative Share of Labor Force 1995 to 2014
    - Figure 2: Electoral consequences, Routine and Non-Routine Voters
    - Figure A3: Importance of job security, Difficulties to find a new job, Concerns about losing the job, and Job dissatisfaction
  - `1_1_Switching_US.do`: This file produces the analysis for the section on vote-switching across institutions for the US. 
    - Figure 3: The effect of exposure to automation on vote-switching. [US Part]
    - Table A3: Switching Vote, IV - RTI, US
    - Table A5: Switching Vote (alternative definition), IV - RTI
    - Table A6: Switching Vote, IV - Routine (dummy), US
    - Table A1: Descriptive statistic: USA GSS 2016 vs 2012
  - `1_2_Switching_Germany.do`: This file produces the analysis for the section on vote-switching across institutions for Germany. Because it builds Figure 3 by merging with the US results, it should be run after `1_1_Switching_US.do` has generated the US part of the figure.
    - Figure 3: The effect of exposure to automation on vote-switching. [German Part]
    - Table A4: Switching Vote (Only left) - Germany, IV - RTI
    - Table A7: Switching Vote From Establishment Left and Right to Populist Right, IV - RTI, German
    - Table A8: Switching Vote, IV - Routine (dummy), Germany
    - Table A9: Switching Vote (Only from the Right), IV - RTI
    - Table A2: Descriptive statistic: Germany SOEP 2014 vs 2018
  - `1_3_Switching_Appendix_SpoonKluver.do`: This file produces the appendix table that provides additional context on switching in Germany.
    - Table A10: Switching in Germany from mainstream to non-mainstream parties 2002-2009
  - `1_4_Switching_Appendix_ESS.do`: This file produces the additional cross-sectional analysis of switching, based on the ESS, that appears in the Appendix.
    - Table A11: Switching in Western European Countries from Mainstream Left to Outsider Radical Right parties 2002-2018
  - `2_3_Speech_US_Germany_Appendix_NMF.ipynb`: This file runs the topic analysis for the US and Germany that is the input for Table A17. Note that Table A17 itself is not generated automatically; the notebook produces the top terms for each topic and the topic shares, from which the table is assembled.
    - Table A17: NMF Topic Modeling, 4 clusters, top-10 terms.
  - `2_1_Rally_US.do`: This file creates the tables associated with the analysis of Trump’s campaign strategies (rallies) in the US
    - Table 1: Trump’s Campaign Strategy
    - Table A13: Trump’s Campaign Strategy (Close election 10)
    - Table A14: Trump’s Campaign Strategy (Forecasting 2016)
    - Table A12: Summary statistics of variables used in this study about Trump’s campaign strategies: rallies
  - `2_0_Speech_US_dictionaries.ipynb`: This file reads the speeches and generates the indicators used as input in `2_2_Speech_US.do`. If you want to go directly to the analysis, you can skip this notebook and run `2_2_Speech_US.do`, which loads the data the notebook generates.
  - `2_2_Speech_US.do`: This file provides the analysis of Trump’s campaign speeches. Before running it, make sure the inputs are available (they are generated in `2_0_Speech_US_dictionaries.ipynb`).
    - Table 2: Trump’s Campaign Strategy: Speeches
    - Table A15: Trump’s Campaign Strategy: Speeches (Total count)
  - `3_2_CMP_PRITM.do`: This file creates the tables for the section on PRITM systems, using cross-sectional evidence from the CMP and the main proxy for partisan polarization over redistribution and fixed attributes that appears in the main text (the distance between establishment and radical right). The same proxy is used for the Appendix tables that modify the cut-off point and use alternative proxies for the definition of fixed attributes.
    - Table 3: PRITM: Partisan Polarization over Redistribution and Fixed Attributes
    - Table A19: Partisan Polarization over Redistribution and Fixed Attributes Different Cut-Off
    - Table A20: Alternative measures of Partisan Polarization over Fixed Attributes between Mainstream Left and Right-Populist
    - Table A18: Descriptive statistics: PRITM 1970-2019
  - `3_3_CMP_PRITM_Appendix_Average.do`: This file produces the Appendix tables that use an alternative definition of partisan polarization: the average distance in the party system.
    - Table A21: Partisan Polarization over Redistribution and Fixed Attributes
  - `3_4_CMP_PRITM_Appendix_Dalton.do`: This file produces the Appendix tables that use an alternative definition of partisan polarization: the Dalton index.
    - Table A22: Partisan Polarization over Redistribution and Fixed Attributes, Dalton Index
  - `3_5_CMP_PRITM_Appendix_ER.do`: This file produces the Appendix tables that use an alternative definition of partisan polarization: Esteban and Ray's (1994) definition (ER).
    - Table A23: Partisan Polarization over Redistribution and Fixed Attributes
  - `3_0_Regional_Germany_HateIncidents.rmd`: This script creates the subset of the ARVIG data used for the analysis of targeting strategies in Germany related to hate incidents across regions. You can skip this file and run `3_1_Regional_Germany.do` directly, since the subset for the selected period (the year before the election) is already provided in `data`.
  - `3_1_Regional_Germany.do`: This file contains the main results for the targeting strategies section, in particular the electoral performance of the AfD across districts in Germany. Note that it requires the hate-incidents file generated in `3_0_Regional_Germany_HateIncidents.rmd`.
    - Table 4: AfD Performance
    - Table A16: Summary statistics of variables used in this study about AfD regional performance
  - `4_2_Figures_Appendix_ESS.do`: This file produces additional figures for the appendix.
    - Figure A2: Share routine and non-routine 2002-2018
  - `4_3_Figures_Appendix_CHES.do`: This file produces additional figures for the appendix.
    - Figure A4: Number of Radical Right Parties in the Party System
  - `4_4_Figures_Appendix_CMP.do`: This file produces additional figures for the appendix.
    - Figure A5: Number of Nationalist Parties in Elections
  - `4_5_Figures_Appendix_StockRobots.do`: This file produces additional figures for the appendix.
    - Figure A1: Stock of robots per thousand of workers base 1993
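
The numeric prefixes are also consistent with the run order implied by the dependencies noted above (`1_1` before `1_2`, `2_0` before `2_2`, `3_0` before `3_1`). The following minimal Python sketch, offered as an illustration rather than as part of the archive, sorts script names by those prefixes. The helper name `run_order` is hypothetical, and the sketch assumes every script name starts with `<section>_<step>_`.

```python
import re


def run_order(filenames):
    """Sort script names by their `<section>_<step>_` numeric prefix."""
    def prefix(name):
        match = re.match(r"(\d+)_(\d+)_", name)
        if match is None:
            raise ValueError(f"no numeric prefix in {name!r}")
        return int(match.group(1)), int(match.group(2))

    return sorted(filenames, key=prefix)


# A few of the scripts listed above, deliberately out of order.
scripts = [
    "3_1_Regional_Germany.do",
    "1_2_Switching_Germany.do",
    "2_2_Speech_US.do",
    "3_0_Regional_Germany_HateIncidents.rmd",
    "1_1_Switching_US.do",
    "2_0_Speech_US_dictionaries.ipynb",
]
for name in run_order(scripts):
    print(name)
```

This only orders the names; each script still has to be run in its own tool (Stata, R, or Jupyter).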
 



- `data`: This directory contains the publicly available data.
  - The files in this folder are formatted, easily accessible versions of the data used in the analysis. Some are used as provided; others are generated by the code described above. The folders inside `data` and their files are described first, followed by the ready-to-use files stored directly in `data`. The codebook for these datasets is available in `Codebook.rmd` and `Codebook.html`.
  - `CMP`: This folder contains data from the CMP, accessed in 2020. The file below is the input for `4_4_Figures_Appendix_CMP.do`, `3_2_CMP_PRITM.do`, `3_3_CMP_PRITM_Appendix_Average.do`, `3_4_CMP_PRITM_Appendix_Dalton.do`, and `3_5_CMP_PRITM_Appendix_ER.do`.
    - `MPDataset_MPDS2020a_stata14.dta`
  - `Region Germany`: This folder contains data to conduct the regional analysis of Germany. These are the input used in `3_1_Regional_Germany.do`
    - `btw2017kreis (3).csv`: This file contains data on the electoral results from Germany accessed from https://www.bundeswahlleiter.de/
    - `RegionEntries14.dta`: This file contains replication data from "Trade and Manufacturing Jobs in Germany" by Wolfgang Dauth, Sebastian Findeisen, and Jens Suedekum.
    - `final_aggregated_data.dta`: This file contains the subset of hate incidents from the ARVIG database for the period of interest (the year before the election). It can be generated by running the script `3_0_Regional_Germany_HateIncidents.rmd` in `do`.
  - `Switching`: This folder contains data to conduct the switching analysis from the US before being processed. 
    - `GSS7218_R3.dta`: This file contains data from GSS accessed in April 2020. 
  - `Text`: This directory contains the data collected for the text analysis of the US. These files are used in `2_0_Speech_US_dictionaries.ipynb` to create part of the inputs for the analysis conducted in `2_2_Speech_US.do`.
    - `Presidential`: This folder contains the txt files of the speeches obtained from the UCSB presidential project.
      - `X.txt`: Several txt files, each containing the speech from one rally. The date, key location, and state are in the name of the txt file.
    - `Youtube`: This folder contains the txt files of the text obtained from YouTube through the YouTube API.
      - `X.txt`: Several txt files, each containing the speech from one rally obtained from YouTube. The file name is the YouTube ID.
    - `Rallies_MSA.xls`: This file lists the names of the txt files in the `Presidential` and `Youtube` folders by Metropolitan Statistical Area (MSA). It is an input for `2_0_Speech_US_dictionaries.ipynb`. Its main purpose is to link each speech, identified by its txt file name, with an MSA; each speech was assigned to one MSA when it was collected. This MSA identification is later used to link the speeches with regional information.
    - `combined_df.csv`: This file contains the processed speech data. It is produced by `2_0_Speech_US_dictionaries.ipynb`.
  - Files ready to use in `data`: 
    - `1999-2019_CHES_dataset_means(v3).dta`: This file contains CHES data, input for `4_3_Figures_Appendix_CHES.do`. 
    - `SpoonKluever_2019_EJPR_PartyConvergence.dta`: This file comes from the Dataverse of Spoon and Kluver 2019 and is used in the Appendix for additional context. It is the input for `1_3_Switching_Appendix_SpoonKluver.do`.
    - `CMP_main.dta`: This file contains the variables needed for the main analysis associated with the CMP. It is generated by the first lines of `3_2_CMP_PRITM.do`; you can load the data directly and skip those lines.
    - `CMP_average.dta`: This file contains the variables needed for the appendix analysis that uses the average distance as the proxy for partisan polarization. It is generated by the first lines of `3_3_CMP_PRITM_Appendix_Average.do`; you can load the data directly and skip those lines.
    - `CMP_Dalton.dta`: This file contains the variables needed for the appendix analysis that uses the Dalton index as the proxy for partisan polarization. It is generated by the first lines of `3_4_CMP_PRITM_Appendix_Dalton.do`; you can load the data directly and skip those lines.
    - `CMP_ER.dta`: This file contains the variables needed for the appendix analysis that uses Esteban and Ray's (1994) definition as the proxy for partisan polarization. It is generated by the first lines of `3_5_CMP_PRITM_Appendix_ER.do`; you can load the data directly and skip those lines.
    - `GSS.dta`: This file contains the variables needed for the analysis of vote switching in the US. It can be produced with the first lines of `1_1_Switching_US.do`; you can load the data directly and skip those lines.
    - `Rally_Visits_MSA.dta`: This file contains the prepared data for the analysis in `2_1_Rally_US.do` and is also an input for `2_2_Speech_US.do`, which adds MSA information to the speech measures. The data combine regional employment data from Muro, Maxim, and Whiton’s 2019 replication data; my own count of rallies by MSA (created from the list of rallies in Wikipedia, accessed March 2023, and checked against news reports, the UCSB presidential project, and YouTube videos); and data on hate incidents prepared from the ADL Center on Extremism, which documents incidents reported through its hate-tracking tools (I coded these data by MSA).
    - `Regional_Germany.dta`: This file contains the variables needed for the regional targeting analysis of Germany. It can be produced with the first lines of `3_1_Regional_Germany.do`; you can load the data directly and skip those lines.
    - `SOEP.dta`: This file contains the variables needed for the analysis of vote switching in Germany. It can be produced with the first lines of `1_2_Switching_Germany.do`; you can load the data directly and skip those lines. The raw data were accessed under contract, and researchers who want to obtain them should follow the same steps (i.e., the raw data are not provided in this Dataverse).
    - `Speech_MSA.dta`: This file contains the variables needed for the analysis of speeches in the US. It can be produced with the first lines of `2_2_Speech_US.do`; you can load the data directly and skip those lines.
    - `Figures_ISSP.dta`: This file contains the variables needed for the creation of Figures from the ISSP; this data is used on the file `4_1_Figures_ISSP.do`. 
    - `Appendix_ESS.dta`: This file contains the data needed for the robustness analysis conducted in `1_4_Switching_Appendix_ESS.do` and for the appendix figures in `4_2_Figures_Appendix_ESS.do`.
    - `filtered_papers.csv`: This csv file contains the merged text from all US speeches; it is the input for the topic analysis conducted in `2_3_Speech_US_Germany_Appendix_NMF.ipynb`
    - `filtered_papers_G.csv`: This csv file contains the merged text from the AfD manifesto; it is the input for the topic analysis conducted in `2_3_Speech_US_Germany_Appendix_NMF.ipynb`.
    - `reproducingacemoglu.csv`: This file contains the variables needed to create Appendix Figure A1, with data from Acemoglu & Restrepo on the number of industrial robots; it is used in `4_5_Figures_Appendix_StockRobots.do`.
  - `not_for_dataverse`: This folder is referenced in some of the code and contains files, listed below, that cannot be shared publicly (contract needed).
    - `pgen.dta`: This file contains data from SOEP accessed in January 2021 through a signed contract.
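
Because several scripts depend on files produced by earlier steps, it can help to confirm that the inputs exist before running them. The sketch below is a hedged Python illustration and is not part of the archive: the helper `missing_inputs` and the `REQUIRED_INPUTS` mapping are hypothetical, the mapping is deliberately incomplete, and the paths are assumptions read off the folder layout described above rather than taken from the scripts themselves.

```python
from pathlib import Path

# Assumed locations, inferred from the folder descriptions in this README.
# The scripts themselves may reference these files through different paths.
REQUIRED_INPUTS = {
    "2_2_Speech_US.do": [
        "data/Text/combined_df.csv",
        "data/Rally_Visits_MSA.dta",
    ],
    "3_1_Regional_Germany.do": [
        "data/Region Germany/final_aggregated_data.dta",
    ],
    "2_3_Speech_US_Germany_Appendix_NMF.ipynb": [
        "data/filtered_papers.csv",
        "data/filtered_papers_G.csv",
    ],
}


def missing_inputs(script, root="."):
    """Return the assumed inputs of `script` that are absent under `root`."""
    base = Path(root)
    return [path for path in REQUIRED_INPUTS.get(script, []) if not (base / path).is_file()]


for script in REQUIRED_INPUTS:
    print(script, "->", missing_inputs(script) or "all inputs found")
```

A script that is not in the mapping returns an empty list, so an empty result means "nothing known to be missing" rather than a guarantee that the script will run.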


- `Codebook.rmd` & `Codebook.html`: This is the codebook containing information on the relevant variables in all the data used in the analysis.
    

- `Figure`: This directory contains all of the figures used in the manuscript.

- `Table`: This directory contains all of the tables in the manuscript.


- `log`: This directory contains the log files from running the various scripts. Some are Stata log files; others are HTML files with the code from R or Python. `Master.smcl` is the log of all do-files run together, and the directory also contains a separate log for each code file.



